Movies have been an important aspect of society since their creation. They portray issues that reflect our culture and current events, while also providing entertainment. As a society, we have been witnessing the severe impact of the COVID-19 pandemic on the movie industry, from movie theater closures to film releases being delayed or pushed straight to streaming services. This project investigates the impact that global events, including the COVID-19 pandemic, have had on a variety of factors within the movie industry. The data for this project comes mainly from the IMDb Movies Extensive Database, supplemented with the IMDB 5000 Movie Dataset. Both of these datasets were populated with data from IMDb.
midpoint_df.csv is a data set we compiled from the IMDB 5000 Movie Dataset and the IMDb Movies Extensive Database for the calculations and data visualizations in this deliverable. It contains data on 1637 movies produced in the US, or in conjunction with the US, from 1990 to 2020. The data set has 12 variables, including plot_keywords, genre, usa_gross_income, budget, language, country, avg_vote, reviews_from_critics, date_published, year, and title_and_year.
About Table: Many of our questions revolve around how movie statistics change from year to year. Our summary table therefore reports, for each year from 1990 to 2020, the number of movies published, the average budget, the average USA gross income, and the most and least common genres.
Insight: Summarizing our large data set by year gives us more specific, cleaned-up information. For example, the summary revealed that our data pool has only one movie for 1992, and that movie did not report a budget, which is why its average budget appears as NaN. The table also displays trends across years. From 1993 to 2015, Drama was the most common genre of released movies. The least common genre varies more from year to year, although History comes up often. Other than 1992 and 2018, the average budget is in the tens of millions; the 2018 average is significantly lower than in the surrounding years. We can also see that 2020 had a significant decrease in average USA gross income. Given the timing, this drop suggests that global events such as the COVID-19 pandemic may have an impact on the movie industry, although the rows for 2017 through 2020 each rest on fewer than ten movies.
| Year | Number of Movies Published | Average Budget | Average USA Gross Income | Most Common Genre | Least Common Genre |
|---|---|---|---|---|---|
| 1990 | 6 | 14205000 | 55549289 | Comedy | Action |
| 1991 | 5 | 15694600 | 17693069 | Drama | Action |
| 1992 | 1 | NaN | 553171 | Action | Action |
| 1993 | 9 | 29187500 | 82194530 | Drama | Animation |
| 1994 | 18 | 35307692 | 81764922 | Drama | Animation |
| 1995 | 20 | 39078947 | 50213473 | Drama | Animation |
| 1996 | 30 | 38945385 | 53125653 | Drama | Family |
| 1997 | 37 | 44958065 | 46053871 | Drama | Family |
| 1998 | 50 | 28253488 | 36550476 | Drama | Fantasy |
| 1999 | 41 | 35045588 | 30532869 | Drama | Biography |
| 2000 | 68 | 32853279 | 29899064 | Drama | History |
| 2001 | 68 | 34096143 | 40879155 | Drama | Musical |
| 2002 | 86 | 33328267 | 41113047 | Drama | History |
| 2003 | 79 | 27669486 | 29988844 | Drama | Animation |
| 2004 | 65 | 38646542 | 47247506 | Drama | Musical |
| 2005 | 98 | 40062415 | 33537240 | Drama | War |
| 2006 | 96 | 34468125 | 25528268 | Drama | Music |
| 2007 | 85 | 37419787 | 42569300 | Drama | Musical |
| 2008 | 78 | 33990769 | 40956521 | Drama | History |
| 2009 | 81 | 40627029 | 46274694 | Drama | History |
| 2010 | 105 | 36840266 | 44933600 | Drama | History |
| 2011 | 102 | 33237211 | 30267374 | Drama | History |
| 2012 | 92 | 42853995 | 53299935 | Drama | History |
| 2013 | 75 | 55074094 | 52736999 | Drama | Sport |
| 2014 | 89 | 38249200 | 41272191 | Drama | War |
| 2015 | 72 | 44742656 | 70895405 | Drama | Fantasy |
| 2016 | 55 | 60447647 | 56733594 | Action | Family |
| 2017 | 7 | 32000000 | 12497767 | Drama | Action |
| 2018 | 5 | 7600000 | 95947786 | Crime | Family |
| 2019 | 8 | 47428571 | 58982142 | Action | Biography |
| 2020 | 6 | 19250000 | 9301084 | Horror | Adventure |
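A per-year summary like the table above could be computed roughly as follows. This is a minimal sketch, not the code we actually ran: the rows are made-up values, the function name `summarize_by_year` is hypothetical, and it assumes that `genre` holds a comma-separated list and that a missing budget is stored as `None` (the table shows it as NaN).

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical rows shaped like midpoint_df.csv; all values are made up.
rows = [
    {"year": 1992, "genre": "Action", "budget": None, "usa_gross_income": 500_000},
    {"year": 1993, "genre": "Drama, Romance", "budget": 20_000_000, "usa_gross_income": 60_000_000},
    {"year": 1993, "genre": "Drama, Romance", "budget": 40_000_000, "usa_gross_income": 80_000_000},
    {"year": 1993, "genre": "Animation, Drama", "budget": None, "usa_gross_income": 10_000_000},
]


def summarize_by_year(rows):
    """Return {year: summary dict} with counts, averages and genre extremes."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["year"]].append(row)

    result = {}
    for year in sorted(by_year):
        movies = by_year[year]
        # Skip missing values so one unreported budget does not spoil the average.
        budgets = [m["budget"] for m in movies if m["budget"] is not None]
        incomes = [m["usa_gross_income"] for m in movies
                   if m["usa_gross_income"] is not None]
        # Assumption: a movie counts once toward each genre it lists.
        genre_counts = Counter(
            g.strip() for m in movies for g in m["genre"].split(",")
        )
        # Sort by descending count, then by name, so ties resolve deterministically.
        ranked = sorted(genre_counts.items(), key=lambda kv: (-kv[1], kv[0]))
        result[year] = {
            "movies": len(movies),
            "avg_budget": mean(budgets) if budgets else None,
            "avg_usa_gross_income": mean(incomes) if incomes else None,
            "most_common_genre": ranked[0][0],
            "least_common_genre": ranked[-1][0],
        }
    return result


table = summarize_by_year(rows)
for year, stats in table.items():
    print(year, stats)
```

With a single movie in a year, the most and least common genre are necessarily the same, which is what the 1992 row shows.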
About Chart: One of the questions we wanted to answer with our data was how different major events impacted the movie industry. To answer this, the chart looks at the average revenue brought in by the movie industry by year.
Insight: The main years we decided to look at were 1991 (introduction of the Internet/World Wide Web), 2001 (9/11), 2005 (introduction of YouTube), 2008 (Great Recession), and 2020 (coronavirus pandemic).
This line graph depicts the search interest in 5 pandemic-related movies, relative to the highest point on the chart for their given time frame. I chose this chart because it clearly shows the popularity of these movies over time (from one year after their release date to November 2020) and helps give a sense of how quarantine brought these movies back into the spotlight. From this graph we can clearly see a boost in attention in March 2020 towards movies that deal with a pandemic, even those that are more fiction than what could happen in reality (see I Am Legend and 28 Days Later, which are notably zombie movies). Every movie's search interest more than doubled in March 2020 relative to November 2019, with Contagion showing the largest change, a 9900% increase in that time frame. The chart makes it clear that, after the year following its release date, the month in which each of these movies was searched the most was the month when the United States started implementing quarantine orders to get people to stay inside.
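To make the "relative to the highest point" scale and the percentage increase concrete, here is a small sketch. The monthly counts are hypothetical and the helper names `scale_to_peak` and `percent_change` are ours for illustration; the underlying index values for each movie are not reproduced here.

```python
def scale_to_peak(counts):
    """Rescale raw counts so that the highest value becomes 100."""
    peak = max(counts)
    return [round(100 * count / peak) for count in counts]


def percent_change(old, new):
    """Percentage change from old to new; undefined when old is 0."""
    return (new - old) / old * 100


# Hypothetical raw monthly search counts for one movie.
monthly_counts = [12, 24, 1200, 600]
index = scale_to_peak(monthly_counts)
print(index)                      # [1, 2, 100, 50]

# On such a scale, a rise from 1 to 100 is a 9900% increase.
print(percent_change(1, 100))     # 9900.0
# "More than double" corresponds to any increase above 100%.
print(percent_change(50, 100))    # 100.0
```

Because every series is scaled to its own peak, the index compares a movie with itself over time, not one movie's absolute popularity against another's.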
About Scatterplot: One of the questions we wanted to consider was the popularity of different movie genres over the years. The scatterplot below addresses that question by displaying the average vote (out of 10) for each movie genre from 1990 to 2020. Since most movies in our dataset had multiple genres, the avg_vote for each movie was applied to each of its genres. Please note that the scatterplot is interactive, so you can hover over each of the data points to see the year, average vote, and genre.
Insight: From the scatterplot, we can see that in 2020 (global event: COVID-19 pandemic), the highest rated genre thus far is Thriller and the lowest rated genre thus far is Mystery. In 2008 (global event: Great Recession), the highest rated genre was Biography and the lowest rated genre was Fantasy. In 2001 (global event: 9/11), the highest rated genre was Musical and the lowest rated genre was Sport. And in 1991 (global event: introduction of the Internet/World Wide Web), the highest rated genre was Biography and the lowest rated genre was Crime.
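The way a movie's avg_vote is applied to each of its genres can be sketched as below. This is an illustration under assumptions, not our plotting code: the movies and votes are made up, `genre` is assumed to be a comma-separated string, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical movies; the votes are invented for illustration.
movies = [
    {"year": 2000, "genre": "Drama, Comedy", "avg_vote": 8.0},
    {"year": 2000, "genre": "Drama, Action", "avg_vote": 6.0},
    {"year": 2000, "genre": "Action", "avg_vote": 5.0},
]


def average_vote_by_year_and_genre(movies):
    """Return {(year, genre): mean avg_vote}, counting a movie once per genre."""
    totals = defaultdict(lambda: [0.0, 0])  # (year, genre) -> [vote sum, count]
    for movie in movies:
        for genre in movie["genre"].split(","):
            key = (movie["year"], genre.strip())
            totals[key][0] += movie["avg_vote"]
            totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def highest_and_lowest(points, year):
    """Return the highest- and lowest-rated genre for one year."""
    in_year = {genre: vote for (y, genre), vote in points.items() if y == year}
    return max(in_year, key=in_year.get), min(in_year, key=in_year.get)


points = average_vote_by_year_and_genre(movies)
print(points[(2000, "Drama")])           # 7.0
print(highest_and_lowest(points, 2000))  # ('Comedy', 'Action')
```

One consequence of this design is that a genre represented by a single movie in a given year takes that movie's vote as its average, so points for rare genres can swing widely between years.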
This bar chart depicts the average revenue (in the US) of the movie industry by year. Looking at the chart, we can see that there is a dip in average earnings from 1999 to 2011, but the largest drop is from 2018 to 2020. Even though the dataframe lists more movies for 2020 than for 2018, there is still a massive drop in earnings, potentially because of the coronavirus pandemic.